user request
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.92)
- Workflow (0.67)
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
Solving complicated AI tasks across different domains and modalities is a key step toward artificial general intelligence. While numerous AI models are available for various domains and modalities, they cannot handle complicated AI tasks autonomously. Considering that large language models (LLMs) have exhibited exceptional abilities in language understanding, generation, interaction, and reasoning, we advocate that LLMs could act as a controller to manage existing AI models to solve complicated AI tasks, with language serving as a generic interface. Based on this philosophy, we present HuggingGPT, an LLM-powered agent that leverages LLMs (e.g., ChatGPT) to connect the various AI models in machine learning communities (e.g., Hugging Face) to solve AI tasks. Specifically, we use ChatGPT to conduct task planning upon receiving a user request, select models according to the function descriptions available in Hugging Face, execute each subtask with the selected AI model, and summarize the response from the execution results. By combining the strong language capability of ChatGPT with the abundant AI models in Hugging Face, HuggingGPT can tackle a wide range of sophisticated AI tasks spanning different modalities and domains, achieving impressive results in language, vision, speech, and other challenging tasks and paving a new way toward artificial general intelligence.
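The abstract outlines a four-stage pipeline: task planning, model selection, task execution, and response generation. Below is a minimal sketch of that control flow, assuming a placeholder `chat` LLM endpoint and a `MODEL_CARDS` dictionary of model descriptions; both are hypothetical stand-ins, not the authors' actual interfaces.

```python
import json

def chat(prompt: str) -> str:
    """Placeholder for a call to the controller LLM (e.g., ChatGPT)."""
    raise NotImplementedError

MODEL_CARDS = {}  # model id -> natural-language function description

def hugginggpt(user_request: str) -> str:
    # 1. Task planning: the LLM decomposes the request into subtasks.
    subtasks = json.loads(chat(f"Decompose into a JSON list of subtasks: {user_request}"))
    results = []
    for task in subtasks:
        # 2. Model selection: pick a model whose description fits the subtask.
        model_id = chat(
            f"Task: {task}. Candidates: {json.dumps(MODEL_CARDS)}. "
            "Reply with the single best model id."
        )
        # 3. Task execution: run the selected model (stubbed out here).
        results.append({"task": task, "model": model_id, "output": "..."})
    # 4. Response generation: summarize the execution results for the user.
    return chat(f"Summarize these results for the user: {json.dumps(results)}")
```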
The Art of Saying No: Contextual Noncompliance in Language Models
Chat-based language models are designed to be helpful, yet they should not comply with every user request. While most existing work focuses primarily on refusal of "unsafe" queries, we posit that the scope of noncompliance should be broadened. We introduce a comprehensive taxonomy of contextual noncompliance describing when and how models should not comply with user requests.
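As a sketch of what such a taxonomy can look like in code, the categories below are illustrative placeholders only; the paper's actual taxonomy is broader and finer-grained.

```python
from enum import Enum
from typing import Optional

class NoncomplianceCategory(Enum):
    # Illustrative labels only; see the paper for the real taxonomy.
    UNSAFE = "request poses safety or legal risks"
    INCOMPLETE = "request is underspecified or missing context"
    UNSUPPORTED = "request exceeds the model's capabilities"
    INDETERMINATE = "no well-grounded answer exists"

def should_comply(category: Optional[NoncomplianceCategory]) -> bool:
    # Comply only when no noncompliance category applies to the request.
    return category is None
```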
PlanetServe: A Decentralized, Scalable, and Privacy-Preserving Overlay for Democratizing Large Language Model Serving
Fang, Fei, Hua, Yifan, Wang, Shengze, Zhou, Ruilin, Liu, Yi, Qian, Chen, Zhang, Xiaoxue
While significant progress has been made in research and development on open-source and cost-efficient large language models (LLMs), serving scalability remains a critical challenge, particularly for small organizations and individuals seeking to deploy and test their LLM innovations. Inspired by peer-to-peer networks that leverage decentralized overlay nodes to increase throughput and availability, we propose GenTorrent, an LLM serving overlay that harnesses computing resources from decentralized contributors. We identify four key research problems inherent to enabling such a decentralized infrastructure: 1) overlay network organization; 2) LLM communication privacy; 3) overlay forwarding for resource efficiency; and 4) verification of serving quality. This work presents the first systematic study of these fundamental problems in the context of decentralized LLM serving. Evaluation results from a prototype implemented on a set of decentralized nodes demonstrate that GenTorrent achieves a latency reduction of over 50% compared to the baseline design without overlay forwarding. Furthermore, the security features introduce minimal overhead to serving latency and throughput. We believe this work pioneers a new direction for democratizing and scaling future AI serving capabilities.
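As one hedged illustration of the overlay-forwarding idea (problem 3), a client could probe contributor nodes and route requests through the lowest-latency one; the node URLs and `probe` helper below are hypothetical, not the system's actual mechanism.

```python
import time
import urllib.request

# Hypothetical contributor nodes in the serving overlay.
NODES = ["https://node-a.example", "https://node-b.example"]

def probe(url: str) -> float:
    """Measure a rough round-trip time to an overlay node."""
    start = time.monotonic()
    urllib.request.urlopen(url, timeout=2).read(0)
    return time.monotonic() - start

def best_node() -> str:
    # Forward the request through the lowest-latency overlay node.
    return min(NODES, key=probe)
```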
- South America (0.04)
- North America > United States > Nevada > Washoe County > Reno (0.04)
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- (3 more...)
Constrained Network Slice Assignment via Large Language Models
Sudhakara, Sagar, Rajak, Pankaj
Modern networks support network slicing, which partitions physical infrastructure into virtual slices tailored to different service requirements (for example, high bandwidth or low latency). Optimally allocating users to slices is a constrained optimization problem that traditionally requires complex algorithms. In this paper, we explore the use of Large Language Models (LLMs) to tackle radio resource allocation for network slicing. We focus on two approaches: (1) using an LLM in a zero-shot setting to directly assign user service requests to slices, and (2) formulating an integer programming model where the LLM provides semantic insight by estimating similarity between requests. Our experiments show that an LLM, even with zero-shot prompting, can produce a reasonable first draft of slice assignments, although it may violate some capacity or latency constraints. We then incorporate the LLM's understanding of service requirements into an optimization solver to generate an improved allocation. The results demonstrate that LLM-guided grouping of requests, based on minimal textual input, achieves performance comparable to traditional methods that use detailed numerical data, in terms of resource utilization and slice isolation. While the LLM alone does not perfectly satisfy all constraints, it significantly reduces the search space and, when combined with exact solvers, provides a promising approach for efficient 5G network slicing resource allocation.
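A minimal sketch of the second approach follows, assuming PuLP as the solver and hypothetical LLM-estimated similarity scores `sim` between requests and slice profiles; the paper's exact formulation may differ.

```python
from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary

requests = ["r1", "r2", "r3"]
slices = {"high_bw": 2, "low_lat": 1}  # slice -> capacity (users)
# LLM-estimated semantic fit between each request and each slice profile.
sim = {("r1", "high_bw"): 0.9, ("r1", "low_lat"): 0.2,
       ("r2", "high_bw"): 0.3, ("r2", "low_lat"): 0.8,
       ("r3", "high_bw"): 0.7, ("r3", "low_lat"): 0.4}

prob = LpProblem("slice_assignment", LpMaximize)
x = {(r, s): LpVariable(f"x_{r}_{s}", cat=LpBinary)
     for r in requests for s in slices}
# Objective: maximize total semantic fit of the assignment.
prob += lpSum(sim[r, s] * x[r, s] for r in requests for s in slices)
for r in requests:  # each request is assigned to exactly one slice
    prob += lpSum(x[r, s] for s in slices) == 1
for s, cap in slices.items():  # respect slice capacity constraints
    prob += lpSum(x[r, s] for r in requests) <= cap
prob.solve()
```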
- Telecommunications (0.67)
- Information Technology (0.46)
HuggingR$^{4}$: A Progressive Reasoning Framework for Discovering Optimal Model Companions
Ma, Shaoyin, Song, Jie, Wang, Huiqiong, Sun, Li, Song, Mingli
Large Language Models (LLMs) have made remarkable progress in their ability to interact with external interfaces. Selecting suitable external interfaces has thus become a crucial step in constructing LLM agents. In contrast to invoking API tools, directly calling AI models across different modalities from the community (e.g., HuggingFace) poses challenges due to the vast scale (> 10k models), metadata gaps, and unstructured descriptions. Current methods for model selection often incorporate entire model descriptions into prompts, resulting in prompt bloat, wasted tokens, and limited scalability. To address these issues, we propose HuggingR$^4$, a novel framework that combines Reasoning, Retrieval, Refinement, and Reflection to efficiently select models. Specifically, we first perform multiple rounds of reasoning and retrieval to obtain a coarse list of candidate models. Then, we conduct fine-grained refinement by analyzing candidate model descriptions, followed by reflection to assess the results and determine whether the retrieval scope needs to be expanded. This method reduces token consumption considerably by decoupling user query processing from complex model description handling. Through a pre-established vector database, complex model descriptions are stored externally and retrieved on demand, allowing the LLM to concentrate on interpreting user intent while accessing only relevant candidate models without prompt bloat. In the absence of standardized benchmarks, we construct a multimodal human-annotated dataset comprising 14,399 user requests across 37 tasks and conduct a thorough evaluation. HuggingR$^4$ attains a workability rate of 92.03% and a reasonability rate of 82.46%, surpassing the existing method by 26.51% and 33.25%, respectively, on GPT-4o-mini.
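A minimal sketch of the four-stage loop, with `llm` and `vector_db.search` as placeholders for the controller LLM and the pre-built vector database of model descriptions; the names and prompts are assumptions, not the framework's actual interfaces.

```python
def select_model(query: str, llm, vector_db, max_rounds: int = 3):
    k = 5
    for _ in range(max_rounds):
        # Reasoning: rewrite the user query into a retrieval query.
        search_query = llm(f"Describe the model needed for: {query}")
        # Retrieval: fetch only the top-k candidate descriptions on demand,
        # keeping full model cards out of the prompt.
        candidates = vector_db.search(search_query, top_k=k)
        # Refinement: rank the retrieved candidates against the user intent.
        choice = llm(f"Query: {query}\nCandidates: {candidates}\nBest model id?")
        # Reflection: accept the choice or widen the retrieval scope.
        if llm(f"Does {choice} satisfy '{query}'? Answer yes or no.") == "yes":
            return choice
        k *= 2  # expand the retrieval scope and try again
    return None
```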
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Workflow (1.00)
- Research Report (1.00)
Procedural Knowledge Improves Agentic LLM Workflows
Hsiao, Vincent, Roberts, Mark, Smith, Leslie
Large language models (LLMs) often struggle when performing agentic tasks without substantial tool support, prompt engineering, or fine-tuning. Although research shows that domain-dependent, procedural knowledge can dramatically increase planning efficiency, little work evaluates its potential for improving LLM performance on agentic tasks that may require implicit planning. We formalize, implement, and evaluate an agentic LLM workflow that leverages procedural knowledge in the form of a hierarchical task network (HTN). Empirical results from our implementation show that hand-coded HTNs can dramatically improve LLM performance on agentic tasks, and that using HTNs can boost a 20b- or 70b-parameter LLM to outperform a much larger 120b-parameter LLM baseline. Furthermore, LLM-created HTNs improve overall performance, though less so. The results suggest that leveraging expertise (from humans, documents, or LLMs) to curate procedural knowledge will become another important tool for improving LLM workflows.
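A minimal sketch of the idea, assuming a hand-coded HTN represented as a plain dictionary and a hypothetical `llm_act` helper for primitive steps; the paper's formalization is richer than this.

```python
# Hand-coded HTN: compound tasks decompose into ordered subtasks.
HTN = {
    "make_coffee": ["boil_water", "grind_beans", "brew"],
}

def llm_act(primitive_task: str) -> str:
    """Placeholder: ask the LLM to carry out one primitive step."""
    raise NotImplementedError

def execute(task: str) -> list[str]:
    # Compound tasks decompose via the HTN; primitives go to the LLM,
    # so the network supplies the implicit plan the LLM would otherwise
    # have to invent on its own.
    if task in HTN:
        results = []
        for subtask in HTN[task]:
            results.extend(execute(subtask))
        return results
    return [llm_act(task)]
```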
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- (2 more...)
- Workflow (1.00)
- Research Report > New Finding (0.48)
- Consumer Products & Services > Travel (0.68)
- Government > Military (0.46)
EU-Agent-Bench: Measuring Illegal Behavior of LLM Agents Under EU Law
Lichkovski, Ilija, Müller, Alexander, Ibrahim, Mariam, Mhundwa, Tiwai
Large language models (LLMs) are increasingly deployed as agents in various contexts, with tools placed at their disposal. However, LLM agents can exhibit unpredictable behaviors, including taking undesirable and/or unsafe actions. To measure the latent propensity of LLM agents for taking illegal actions under an EU legislative context, we introduce EU-Agent-Bench, a verifiable, human-curated benchmark that evaluates an agent's alignment with EU legal norms in situations where benign user inputs could lead to unlawful actions. Our benchmark spans scenarios across several categories, including data protection, bias/discrimination, and scientific integrity, with each user request allowing for both compliant and non-compliant execution of the requested actions. Comparing the model's function calls against a rubric exhaustively supported by citations of the relevant legislation, we evaluate the legal compliance of frontier LLMs, and we further investigate the effect on compliance of providing the relevant legislative excerpts in the agent's system prompt along with explicit instructions to comply. We release a public preview set for the research community, while holding out a private test set to prevent data contamination when evaluating upcoming models. We encourage future work extending agentic safety benchmarks to different legal jurisdictions and to multi-turn and multilingual interactions. We release our code at https://github.com/ilijalichkovski/eu-agent-bench.
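A minimal sketch of rubric-based scoring in this spirit; the rubric fields below are assumptions for illustration, not the benchmark's actual schema.

```python
def is_compliant(agent_calls: list[str], rubric: dict) -> bool:
    # Hypothetical rubric fields: calls that constitute unlawful actions
    # versus calls that make up the lawful execution path.
    forbidden = set(rubric["non_compliant_calls"])
    required = set(rubric["compliant_calls"])
    made = set(agent_calls)
    # Compliant iff the agent avoided every forbidden call and
    # completed every required one.
    return made.isdisjoint(forbidden) and required <= made
```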
- Europe (0.51)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > China (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government > Europe Government (0.51)
TripScore: Benchmarking and rewarding real-world travel planning with fine-grained evaluation
Qu, Yincen, Xiao, Huan, Li, Feng, Li, Gregory, Zhou, Hui, Dai, Xiangying, Dai, Xiaoru
Travel planning is a valuable yet complex task that poses significant challenges even for advanced large language models (LLMs). While recent benchmarks have advanced the evaluation of LLMs' planning capabilities, they often fall short in assessing the feasibility, reliability, and engagement of travel plans. We introduce a comprehensive benchmark for travel planning that unifies fine-grained criteria into a single reward, enabling direct comparison of plan quality and seamless integration with reinforcement learning (RL). Our evaluator achieves moderate agreement with travel-expert annotations (60.75%) and outperforms multiple LLM-as-judge baselines. We further release a large-scale dataset of 4,870 queries, including 219 real-world, free-form requests, to test generalization to authentic user intent. Using this benchmark, we conduct extensive experiments across diverse methods and LLMs, including test-time computation, neuro-symbolic approaches, supervised fine-tuning, and RL via GRPO. Across base models, RL generally improves itinerary feasibility over prompt-only and supervised baselines, yielding higher unified reward scores.
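A minimal sketch of folding fine-grained criteria into a single scalar reward suitable for GRPO; the criterion names and weights are illustrative assumptions, not TripScore's published weighting.

```python
# Hypothetical per-criterion weights; the benchmark's real criteria
# and weighting scheme may differ.
WEIGHTS = {"feasibility": 0.5, "reliability": 0.3, "engagement": 0.2}

def unified_reward(criterion_scores: dict[str, float]) -> float:
    """Each criterion score lies in [0, 1]; the unified reward is their
    weighted sum, giving RL a single scalar to optimize."""
    return sum(WEIGHTS[name] * criterion_scores[name] for name in WEIGHTS)
```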
- Europe > Austria > Vienna (0.14)
- Asia > Singapore (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (7 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.92)
- Workflow (0.67)